Relative Rank Statistics for Dialog Analysis

Author

Juan M. Huerta

Abstract

We introduce the relative rank differential statistic, a non-parametric approach to document and dialog analysis based on word frequency rank statistics. We also present a simple method to establish semantic saliency in dialogs, documents, and dialog segments using these word frequency rank statistics. Applications of our technique include the dynamic tracking of topic and semantic evolution in a dialog, topic detection, automatic generation of document tags, and new story or event detection in conversational speech and text. Our approach benefits from the robustness, simplicity, and efficiency of non-parametric and rank-based approaches, and it consistently outperformed term-frequency and TF-IDF cosine distance approaches in several experiments.
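The abstract does not state the formula for the statistic, so the following is only a minimal sketch of one plausible rank-based saliency score, not the author's definition. It assumes that a word is salient when its frequency rank in a dialog segment is much higher than its rank in a background corpus, and it scores each word as (background rank - segment rank) / segment rank. The function names, the tie-breaking rule, and the treatment of words unseen in the background are all illustrative assumptions.

```python
from collections import Counter


def frequency_ranks(tokens):
    """Map each word to its 1-based frequency rank (1 = most frequent)."""
    counts = Counter(tokens)
    # Sort by descending count, then alphabetically so ties are deterministic.
    ordered = sorted(counts, key=lambda w: (-counts[w], w))
    return {word: rank for rank, word in enumerate(ordered, start=1)}


def relative_rank_differential(segment_tokens, background_tokens):
    """Score each segment word by how far its rank rose relative to the background.

    Assumed formula (hypothetical, not taken from the paper):
        score(w) = (background_rank(w) - segment_rank(w)) / segment_rank(w)
    """
    segment_ranks = frequency_ranks(segment_tokens)
    background_ranks = frequency_ranks(background_tokens)
    # Assumption: words absent from the background get a rank just past its end.
    unseen_rank = len(background_ranks) + 1
    return {
        word: (background_ranks.get(word, unseen_rank) - rank) / rank
        for word, rank in segment_ranks.items()
    }


def salient_words(segment_tokens, background_tokens, k=3):
    """Return the k words with the largest assumed rank differential."""
    scores = relative_rank_differential(segment_tokens, background_tokens)
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]


background = "the the the the a a a of of flight".split()
segment = "flight flight flight the the a".split()
scores = relative_rank_differential(segment, background)
top = salient_words(segment, background, k=1)
```

In this toy input, "flight" moves from the least frequent word in the background to the most frequent word in the segment, so it receives the largest score, while function words such as "the" score below zero. Because only ranks are compared, the score does not depend on the absolute counts or on any assumed frequency distribution, which is the general appeal of rank-based methods that the abstract mentions.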

Similar resources

Bootstrap and fast double bootstrap tests of cointegration rank with financial time series

The likelihood ratio test of cointegration rank is the most widely used test for cointegration. Many studies have shown by simulation that the small sample distribution is not well approximated by the limiting distribution. We suggest using the bootstrap to generate small sample critical values instead of correcting the test statistics. The idea of bootstrapping the trace test of cointegration ...

Hypotheses ranking for robust domain classification and tracking in dialogue systems

We present a novel application of hypothesis ranking (HR) for the task of domain detection in a multi-domain, multiturn dialog system. Alternate, domain dependent, semantic frames from a spoken language understanding (SLU) analysis are ranked using a gradient boosted decision trees (GBDT) ranker to determine the most likely domain. The ranker, trained using Lambda Rank, makes use of a range of ...

Semantic tokenization of verbalized numbers in language modeling

In spoken dialog systems, number strings frequently carry crucial information such as DATE, TIME, and PRICE. Yet numbers are inherently difficult to recognize, partly because reliable statistics for training a language model are hard to obtain. In this paper, we take advantage of the fact that dialog systems perform some form of semantic parsing. We use this parsing information to distinguis...

Rank tests and regression rank score tests in measurement error models

The rank and regression rank score tests of linear hypothesis in the linear regression model are modified for measurement error models. The modified tests are still distribution free. Some tests of linear subhypotheses are invariant to the nuisance parameter, others are based on the aligned ranks using the R-estimators. The asymptotic relative efficiencies of tests with respect to tests in model...

Empirical Methods for Evaluating Dialog Systems

We examine what purpose a dialog metric serves and then propose empirical methods for evaluating systems that meet that purpose. The methods include a protocol for conducting a wizard-of-oz experiment and a basic set of descriptive statistics for substantiating performance claims using the data collected from the experiment as an ideal benchmark or “gold standard” for comparative judgments. The...

Publication date: 2008
